    Outlier Detection in Heterogeneous Datasets using Automatic Tuple Expansion

    Rapidly developing areas of information technology are generating massive amounts of data. Human errors, sensor failures, and other unforeseen circumstances unfortunately tend to undermine the quality and consistency of these datasets by introducing outliers -- data points that exhibit surprising behavior when compared to the rest of the data. Characterizing, locating, and in some cases eliminating these outliers offers interesting insight about the data under scrutiny and reinforces the confidence that one may have in conclusions drawn from otherwise noisy datasets. In this paper, we describe a tuple expansion procedure that reconstructs rich information from semantically poor SQL data types such as strings, integers, and floating-point numbers. We then use this procedure as the foundation of a new user-guided outlier detection framework, dBoost, which relies on inference and statistical modeling of heterogeneous data to flag suspicious fields in database tuples. We show that this novel approach achieves good classification performance, both in traditional numerical datasets and in highly non-numerical contexts such as mostly textual datasets. Our implementation is publicly available under version 3 of the GNU General Public License.
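
    The two ingredients above, tuple expansion and per-feature statistical modeling, can be sketched in a few lines of Python. Everything below (the function names, the timestamp heuristic, the 3-sigma Gaussian threshold) is a hypothetical illustration, not dBoost's actual implementation:

```python
# Hypothetical sketch of tuple expansion + per-feature outlier flagging.
# Not dBoost's API; a toy model of the idea described in the abstract.
from datetime import datetime, timezone
from statistics import mean, stdev

def expand(value):
    """Tuple expansion: reconstruct rich features from a poor SQL type."""
    if isinstance(value, int) and 10**9 < value < 2 * 10**9:
        # Plausibly a Unix timestamp: expose its calendar structure.
        t = datetime.fromtimestamp(value, tz=timezone.utc)
        return {"year": t.year, "month": t.month,
                "weekday": t.weekday(), "hour": t.hour}
    if isinstance(value, str):
        return {"length": len(value), "digits": sum(c.isdigit() for c in value)}
    return {"value": float(value)}

def flag_outliers(column, theta=3.0):
    """Flag fields whose expanded features lie > theta stdevs from the mean.
    Assumes a homogeneous column, so all expansions share the same keys."""
    expanded = [expand(v) for v in column]
    stats = {f: (mean(e[f] for e in expanded), stdev(e[f] for e in expanded))
             for f in expanded[0]}
    return [(i, f)
            for i, e in enumerate(expanded)
            for f, (mu, sigma) in stats.items()
            if sigma > 0 and abs(e[f] - mu) > theta * sigma]

# 18 timestamps at the same time of day, plus one an hour off.
stamps = [1400000000 + d * 86400 for d in range(18)] + [1400000000 + 3600]
print(flag_outliers(stamps))  # -> [(18, 'hour')]
```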

    Meta-F*: Proof Automation with SMT, Tactics, and Metaprograms

    We introduce Meta-F*, a tactics and metaprogramming framework for the F* program verifier. The main novelty of Meta-F* is allowing the use of tactics and metaprogramming to discharge assertions not solvable by SMT, or to simplify them into well-behaved SMT fragments. In addition, Meta-F* can be used to generate verified code automatically. Meta-F* is implemented as an F* effect, which, given F*'s powerful effect system, greatly increases code reuse and even enables lightweight verification of metaprograms. Metaprograms can be either interpreted or compiled to efficient native code that can be dynamically loaded into the F* type-checker and can interoperate with interpreted code. Evaluation on realistic case studies shows that Meta-F* provides substantial gains in proof development, efficiency, and robustness. (Full version of the ESOP'19 paper.)
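
    As a toy illustration of the division of labor described above (in Python rather than F*, with entirely hypothetical names), a metaprogram can first massage an assertion into pieces that a solver handles well, leaving the solver to discharge what remains:

```python
# Toy model of "tactics simplify goals into well-behaved SMT fragments".
# Meta-F* metaprograms are F* terms running inside the F* type-checker;
# nothing here reflects its actual API.

def smt(goal: str) -> bool:
    """Stand-in SMT backend that only accepts conjunction-free goals."""
    return "and" not in goal

def split_conjunctions(goal: str) -> list:
    """'Tactic': replace the goal 'p and q' by the subgoals 'p', 'q'."""
    return [g.strip() for g in goal.split(" and ")]

def prove(goal: str) -> bool:
    subgoals = split_conjunctions(goal)   # the metaprogram runs first...
    return all(smt(g) for g in subgoals)  # ...SMT discharges what remains

print(prove("x >= 0 and x + 1 > x"))      # True
```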

    Compilation using correct-by-construction program synthesis

    Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. By Clément Pit-Claudel. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Cataloged from the student-submitted PDF version of the thesis. Includes bibliographical references (pages [131]-134).
    Extracting and compiling certified programs may introduce bugs in otherwise proven-correct code, reducing the extent of the guarantees that proof assistants and correct-by-construction program-derivation frameworks provide. We explore a novel approach to extracting and compiling embedded domain-specific languages developed in a proof assistant (Coq), showing how it allows us to extend correctness guarantees all the way down to a verification-aware assembly language. Our core idea is to phrase compilation of shallowly embedded programs to a lower-level deeply embedded language as a synthesis problem, solved using simple proof-search techniques. This technique is extensible (support for individual language constructs is provided by a user-extensible database of compilation tactics and lemmas) and allows the source programs to depend on axiomatically specified methods of externally implemented data structures, delaying linking to the assembly stage. Composed with the Fiat and Bedrock frameworks, our new method provides the first proof-generating automatic translation from SQL-style relational programs into executable assembly code.
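
    The core idea, compilation as proof search over an extensible database of lemmas, can be caricatured in Python (the thesis works in Coq; all names below are invented for illustration): each "lemma" maps a source construct to target instructions plus remaining subgoals, and a recursive search assembles the translation.

```python
# Hypothetical sketch: compilation phrased as search over an extensible
# database of "compilation lemmas" (nothing like the thesis's Coq code).
LEMMAS = []

def lemma(f):
    """Register a compilation lemma; users can extend the database."""
    LEMMAS.append(f)
    return f

@lemma
def compile_const(e):
    if isinstance(e, int):
        return [f"PUSH {e}"], []          # instructions, no subgoals

@lemma
def compile_add(e):
    if isinstance(e, tuple) and e[0] == "+":
        return ["ADD"], [e[1], e[2]]      # compile both operands first

def compile_expr(e):
    for lem in LEMMAS:                    # proof search: try each lemma
        result = lem(e)
        if result is not None:
            code, subgoals = result
            return [i for g in subgoals for i in compile_expr(g)] + code
    raise NotImplementedError(f"no compilation lemma applies to {e!r}")

print(compile_expr(("+", 1, ("+", 2, 3))))
# -> ['PUSH 1', 'PUSH 2', 'PUSH 3', 'ADD', 'ADD']
```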

    Relational compilation: Functional-to-imperative code generation for performance-critical applications

    Purely functional programs verified using interactive theorem provers typically need to be translated to run: either by extracting them to a similar language (like Coq to OCaml) or by proving them equivalent to deeply embedded implementations (like C programs). Traditionally, the first approach is automated but produces unverified programs with average performance, and the second approach is manual but produces verified, high-performance programs. This thesis shows how to recast program extraction as a proof-search problem to automatically derive correct-by-construction, high-performance code from shallowly embedded functional programs. It introduces a unifying framework, relational compilation, to capture and extend recent developments in program extraction, with a focus on modularity and sound extensibility. To demonstrate the value of this approach, it then presents Rupicola, a relational compiler-construction toolkit designed to extract fast, verified, idiomatic low-level code from annotated functional models. The originality of this approach lies in its combination of foundational proofs, extensibility, and performance, backed by an unconventional take on compiler extensions: unlike traditional compilers, Rupicola generates good code not because of clever built-in optimizations, but because it lets expert users plug in domain- and sometimes program-specific extensions to generate exactly the low-level code that they want. The thesis demonstrates the benefits of this approach through case studies and performance benchmarks that highlight how easy Rupicola makes it to create domain-specific compilers that generate code with performance comparable to that of handwritten C programs. Ph.D. thesis.
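
    In the same spirit as the sketch above, the "program-specific extensions" idea can be mimicked as follows (hypothetical Python, nothing like Rupicola's Coq API, and without the soundness proofs that make the real system foundational): users register rules mapping functional patterns to exactly the low-level code they want.

```python
# Hypothetical sketch of extensible, pattern-directed code generation.
# Rupicola additionally proves each extension sound; this toy does not.
RULES = []

def extension(f):
    """Register a user-supplied compilation rule."""
    RULES.append(f)
    return f

@extension
def compile_sum(expr):
    # Program-specific rule: a list sum becomes an accumulator loop.
    if expr[0] == "sum":
        xs = expr[1]
        return ["acc = 0;",
                f"for (i = 0; i < {xs}_len; i++) acc += {xs}[i];"]

def compile_expr(expr):
    for rule in RULES:                    # first applicable rule wins
        code = rule(expr)
        if code is not None:
            return code
    raise NotImplementedError(f"no rule for {expr[0]!r}")

print("\n".join(compile_expr(("sum", "xs"))))
# acc = 0;
# for (i = 0; i < xs_len; i++) acc += xs[i];
```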

    Trigger Selection Strategies to Stabilize Program Verifiers

    SMT-based program verifiers often suffer from the so-called butterfly effect, in which minor modifications to the program source cause significant instabilities in verification times, which in turn may lead to spurious verification failures and a degraded user experience. This paper identifies matching loops (ill-behaved quantifiers causing an SMT solver to repeatedly instantiate a small set of quantified formulas) as a significant contributor to these instabilities, and describes techniques to detect and prevent them. At their core, the contributed techniques move the trigger selection logic away from the SMT solver and into the high-level verifier: this move allows authors of verifiers to annotate, rewrite, and analyze user-written quantifiers to improve the solver's performance, using information that is easily available at the source level but would be hard to extract from the heavily encoded terms that the solver works with. The paper illustrates three core techniques (quantifier splitting, trigger sharing, and matching loop detection) by extending the Dafny verifier with its own trigger selection routine, and demonstrates significant predictability and performance gains on both Dafny's test suite and large verification efforts using Dafny.
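
    As a concrete illustration (a textbook example, not taken from the paper), consider a quantifier whose trigger matches the very terms its own instantiations produce:

    $$\forall x.\ f(x) = f(g(x)) \qquad \text{with trigger } \{f(x)\}$$

    If the ground term f(a) is present, instantiating with x := a produces f(g(a)), which again matches the trigger with x := g(a), yielding f(g(g(a))), and so on: the solver keeps re-instantiating the same formula forever. Detecting such self-matching patterns at the source level is exactly what the matching-loop analysis described above enables.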

    Coq library for formal topology

    Version of the Coq formal topology library at the time of submission of my Master's thesis.

    The essence of Bluespec: a core language for rule-based hardware design

    The Bluespec hardware-description language presents a significantly higher-level view than hardware engineers are used to, exposing a simpler concurrency model that promotes formal proof, without compromising on the performance of compiled circuits. Unfortunately, the cost model of Bluespec has been unclear, with performance details depending on a mix of user hints and opaque static analysis of potential concurrency conflicts within a design. In this paper we present Koika, a derivative of Bluespec that preserves its desirable properties and yet gives direct control over the scheduling decisions that determine performance. Koika has a novel and deterministic operational semantics that uses dynamic analysis to avoid concurrency anomalies. Our implementation includes Coq definitions of syntax, semantics, key metatheorems, and a verified compiler to circuits. We argue that most of the extra circuitry required for dynamic analysis can be eliminated by compile-time BSV-style static analysis. Defense Advanced Research Projects Agency (DARPA) (Grant HR001118C0018); National Science Foundation (Grant CCF-1521584).
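
    The one-rule-at-a-time flavor of the semantics, and the role of dynamic analysis, can be mimicked in a drastically simplified Python model (hypothetical throughout; the real Koika semantics tracks read/write ports and forwarding, which this toy omits):

```python
# Toy model of rule-based semantics with dynamic conflict detection:
# within a cycle, a rule that would read a register already written by
# an earlier rule aborts, so each cycle's net effect matches some
# sequential execution of a subset of the rules.
class Abort(Exception):
    pass

class Cycle:
    def __init__(self, regs):
        self.regs = regs   # register state committed at cycle start
        self.log = {}      # writes accumulated during this cycle

    def read(self, name):
        if name in self.log:          # would observe an intra-cycle write
            raise Abort()
        return self.regs[name]

    def write(self, name, value):
        self.log[name] = value

def run_cycle(regs, rules):
    cycle = Cycle(regs)
    for rule in rules:                # deterministic scheduler order
        saved = dict(cycle.log)
        try:
            rule(cycle)
        except Abort:                 # conflicting rule: drop its effects;
            cycle.log = saved         # the other rules still fire
    regs.update(cycle.log)
    return regs

rules = [
    lambda c: c.write("counter", c.read("counter") + 1),
    lambda c: c.write("out", c.read("counter")),  # aborts this cycle
]
print(run_cycle({"counter": 0, "out": 0}, rules))
# -> {'counter': 1, 'out': 0}: as if only the first rule ran
```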

    Hydras & Co.: Formalized mathematics in Coq for inspiration and entertainment

    Hydras & Co. is a collaborative library of discrete mathematics for the Coq proof assistant, developed as part of the Coq-community organization on GitHub. The Coq code is accompanied by an electronic book, generated with the help of the Alectryon literate proving tool. We present the evolution of the mathematical contents of the library since earlier presentations at JFLA meetings. Then, we describe how the structure of the project is determined by two requirements which must be continuously satisfied. First, the Coq code needs to be compatible with its ever-evolving dependencies (the Coq proof assistant and several Coq packages both from inside and outside Coq-community) and reverse dependencies (Coq-community projects that depend on it). Second, the book needs to be consistent with the Coq code, which undergoes frequent changes to improve structure and include new material. We believe Hydras & Co. demonstrates that books on formalized mathematics are not limited to providing exposition of theories and reasoning techniques; they can also provide inspiration and entertainment that transcends educational goals.